feat:Add /v2/embed endpoint for text and image embeddings in Cohere API #52

Merged 1 commit on Sep 18, 2024

Conversation

HavenDV
Contributor

@HavenDV HavenDV commented Sep 18, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a new /v2/embed API endpoint for obtaining embeddings, supporting both text and image inputs.
    • Added functionality for asynchronous text embedding through the Embedv2Async method.
    • New classes and structures for managing embedding requests and responses, including support for dynamic properties.
  • Documentation

    • Updated OpenAPI specification to include the new embedding endpoint and its schemas.
  • Bug Fixes

    • Enhanced error handling for embedding requests and responses to improve reliability.


coderabbitai bot commented Sep 18, 2024

Walkthrough

The changes introduce new functionality for text and image embeddings in the Cohere API, specifically through the addition of the /v2/embed endpoint. This includes the implementation of methods for embedding requests and responses, along with corresponding data structures. Several new classes and interfaces are created to handle the embedding process, including support for JSON serialization and error handling. The updates enhance the API's capability to process and return embeddings for various applications.

Changes

| Files | Change Summary |
|---|---|
| src/libs/Cohere/Generated/Cohere.CohereApi.Embedv2.g.cs | Introduced Embedv2Async method for obtaining text embeddings, with overloads for different request types. Added partial methods for request preparation and response processing. |
| src/libs/Cohere/Generated/Cohere.ICohereApi.Embedv2.g.cs | Added interface methods for asynchronous text embedding requests. |
| src/libs/Cohere/Generated/Cohere.Models.Embedv2Response*.g.cs | Defined multiple response classes (Embedv2Response, Embedv2Response10, etc.) to represent embedding operation responses, including properties for main data and additional properties. |
| src/libs/Cohere/Generated/Cohere.Models.Images.g.cs | Created Images class for handling image embeddings, including properties for model, input type, and embedding types. |
| src/libs/Cohere/Generated/Cohere.Models.Texts.g.cs | Introduced Texts class for text embedding requests, with required properties for input texts, model, and input type. |
| src/libs/Cohere/Generated/Cohere.Models.TextsTruncate.g.cs | Defined TextsTruncate enum for managing input truncation options and added extension methods for string conversion. |
| src/libs/Cohere/Generated/Cohere.Models.V2EmbedRequest.g.cs | Created V2EmbedRequest struct for embedding requests, allowing either text or images, with validation logic. |
| src/libs/Cohere/Generated/JsonConverters.TextsTruncate*.g.cs | Added custom JSON converters for TextsTruncate and nullable types to facilitate serialization and deserialization. |
| src/libs/Cohere/Generated/JsonSerializerContext.g.cs | Updated JSON serializer context to include new converters for handling embedding requests and truncation options. |
| src/libs/Cohere/openapi.yaml | Introduced a new /v2/embed endpoint in the OpenAPI specification, detailing request and response structures for embedding operations. |
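
For orientation, here is a rough sketch of how a request to the new endpoint might be assembled with plain HttpClient types. It does not use the generated Embedv2Async method, whose exact signature is not shown in this PR. The field names follow the Texts schema summarized above; the base URL, the bearer-token header, and the sample values are assumptions.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

// Hypothetical payload: field names follow the Texts schema in this PR
// (texts, model, input_type, embedding_types); the values are illustrative.
var payload = new
{
    model = "embed-english-v3.0",
    input_type = "search_document",
    texts = new[] { "hello world" },
    embedding_types = new[] { "float" },
};

string json = JsonSerializer.Serialize(payload);

// Assumption: the base URL and auth scheme are not part of this PR.
using var request = new HttpRequestMessage(HttpMethod.Post, "https://api.cohere.com/v2/embed")
{
    Content = new StringContent(json, Encoding.UTF8, "application/json"),
};
request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", "YOUR_API_KEY");

// Nothing is sent here; the request is only built and printed.
Console.WriteLine($"{request.Method} {request.RequestUri}");
Console.WriteLine(json);
```

Sending the request would be a matter of passing it to HttpClient.SendAsync; that step is left out so the sketch runs without network access.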

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant API
    participant EmbedRequest
    participant EmbedResponse

    User->>API: POST /v2/embed
    API->>EmbedRequest: Create V2EmbedRequest
    EmbedRequest-->>API: Validate request
    API->>EmbedResponse: Process embedding
    EmbedResponse-->>API: Return embedding result
    API-->>User: Send response with embeddings

🐰 In the meadow, changes bloom bright,
New embeddings take flight,
Texts and images dance in delight,
API grows, a wondrous sight!
With each request, knowledge we glean,
In the world of data, we reign supreme! 🌼


@github-actions github-actions bot enabled auto-merge September 18, 2024 18:25
@github-actions github-actions bot merged commit 6b047e5 into main Sep 18, 2024
3 checks passed
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add /v2/embed endpoint for text and image embeddings in Cohere API" Sep 18, 2024
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 25

Outside diff range and nitpick comments (22)
src/libs/Cohere/Generated/Cohere.Models.Embedv2Response.g.cs (2)

6-8: Add summary for the 'Embedv2Response' class

The XML documentation comments for the Embedv2Response class are empty. Please provide a meaningful summary to enhance code readability and maintainability.


11-13: Provide summary for the 'Data' property

The XML documentation comments for the Data property are empty. Adding a summary will help other developers understand the purpose and usage of this property.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response2.g.cs (2)

6-8: Add summary for the 'Embedv2Response2' class

The XML documentation comments for the Embedv2Response2 class are empty. Please provide a meaningful summary to enhance code readability and maintainability.


11-13: Provide summary for the 'Data' property

The XML documentation comments for the Data property are empty. Adding a summary will help other developers understand the purpose and usage of this property.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response3.g.cs (2)

6-8: Add summary for the 'Embedv2Response3' class

The XML documentation comments for the Embedv2Response3 class are empty. Please provide a meaningful summary.


11-13: Provide summary for the 'Data' property

Adding a summary for the Data property will help clarify its purpose.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response4.g.cs (2)

6-8: Add summary for the 'Embedv2Response4' class

The XML documentation comments are empty for this class. Providing a summary will improve code comprehension.


11-13: Add documentation for the 'Data' property

Please include a summary to describe what the Data property represents.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response5.g.cs (2)

6-8: Add summary for the 'Embedv2Response5' class

Including a meaningful class summary will assist in understanding its purpose.


11-13: Document the 'Data' property

Provide a summary for the Data property to explain its role.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response6.g.cs (2)

6-8: Add summary for the 'Embedv2Response6' class

Please add a meaningful summary to the class documentation comments.


11-13: Include documentation for the 'Data' property

A brief summary will help clarify the purpose of the Data property.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response7.g.cs (1)

6-8: Add XML documentation comments

The XML documentation comments for the class and the Data property are empty. Providing meaningful summaries will improve code readability and maintainability.

Also applies to: 11-13

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response9.g.cs (1)

6-8: Add XML documentation comments

The XML comments for the class and the Data property are missing. Including descriptive summaries will enhance understanding for other developers.

Also applies to: 11-13

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response10.g.cs (1)

6-8: Add XML documentation comments

The class and the Data property lack XML documentation comments. Adding detailed summaries will benefit code maintainability.

Also applies to: 11-13

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response11.g.cs (1)

6-8: Add XML documentation comments

Including meaningful XML documentation for the class and Data property will improve code comprehension and support.

Also applies to: 11-13

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response12.g.cs (1)

6-8: Add XML documentation comments

The absence of XML documentation comments for the class and Data property reduces code clarity. Providing summaries will aid in understanding the code's purpose.

Also applies to: 11-13

src/libs/Cohere/Generated/JsonConverters.TextsTruncate.g.cs (2)

35-35: Unreachable code after the switch statement

The return default; statement at line 35 may be unnecessary since all cases in the switch either return a value or throw an exception. Removing this line could clean up the code.
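
A minimal sketch of the shape this suggestion points at: a converter whose switch arms all return or throw, so no trailing return is needed. The Truncate enum and TruncateConverter below are hypothetical stand-ins, not the generated TextsTruncate types, and the string values NONE, START, and END are assumed.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

var options = new JsonSerializerOptions();
options.Converters.Add(new TruncateConverter());

Truncate parsed = JsonSerializer.Deserialize<Truncate>("\"END\"", options);
Console.WriteLine(parsed);

// Hypothetical enum standing in for the generated TextsTruncate.
public enum Truncate { None, Start, End }

public sealed class TruncateConverter : JsonConverter<Truncate>
{
    public override Truncate Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Every arm returns or throws, so the compiler needs no fallback return.
        switch (reader.TokenType)
        {
            case JsonTokenType.String:
                return reader.GetString() switch
                {
                    "NONE" => Truncate.None,
                    "START" => Truncate.Start,
                    "END" => Truncate.End,
                    var other => throw new JsonException($"Unknown truncate value: {other}"),
                };
            case JsonTokenType.Number:
                return (Truncate)reader.GetInt32();
            default:
                throw new JsonException($"Unexpected token: {reader.TokenType}");
        }
    }

    public override void Write(Utf8JsonWriter writer, Truncate value, JsonSerializerOptions options)
    {
        writer.WriteStringValue(value switch
        {
            Truncate.None => "NONE",
            Truncate.Start => "START",
            Truncate.End => "END",
            _ => throw new ArgumentOutOfRangeException(nameof(value)),
        });
    }
}
```

Since this file is generated, any such change would presumably belong in the generator template rather than in the .g.cs output.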


44-44: Null check on writer parameter

The null check on the writer parameter is good practice; however, since Utf8JsonWriter is typically not null when Write is called, you might consider whether this check is necessary in this context.

src/libs/Cohere/Generated/Cohere.Models.V2EmbedRequest.g.cs (2)

8-165: Add meaningful XML documentation to public members

The public struct V2EmbedRequest and its members have empty XML documentation comments. Providing detailed summaries for the struct, its properties, methods, and constructors will improve code readability and help users understand the API usage.


106-109: Integrate validation within the struct to prevent invalid states

The Validate() method checks if the V2EmbedRequest instance is in a valid state but relies on the caller to invoke it. To prevent misuse, consider enforcing this validation internally, such as within constructors or property setters, to ensure the struct cannot represent an invalid state.
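
One way to picture this suggestion is a one-of wrapper that can only be built through factory methods, with the check in a private constructor. EmbedInput below is a hypothetical type for illustration, not the generated V2EmbedRequest.

```csharp
#nullable enable
using System;

var request = EmbedInput.FromTexts(new[] { "hello" });
Console.WriteLine(request.IsTexts);

// Hypothetical one-of wrapper; not the generated V2EmbedRequest.
public readonly struct EmbedInput
{
    public string[]? Texts { get; }
    public string[]? Images { get; }

    private EmbedInput(string[]? texts, string[]? images)
    {
        // Exactly one branch must be set; both-null and both-set are rejected.
        if ((texts is null) == (images is null))
        {
            throw new ArgumentException("Provide either texts or images, but not both.");
        }

        Texts = texts;
        Images = images;
    }

    public static EmbedInput FromTexts(string[] texts) => new EmbedInput(texts, null);

    public static EmbedInput FromImages(string[] images) => new EmbedInput(null, images);

    public bool IsTexts => Texts is not null;

    public bool IsImages => Images is not null;
}
```

One caveat: default(EmbedInput) bypasses every constructor and leaves both branches null, so a struct cannot rule out the empty state by construction alone. A Validate() call at the serialization boundary may still be worth keeping for that case.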

src/libs/Cohere/Generated/Cohere.CohereApi.Embedv2.g.cs (1)

94-97: Preserve the original exception type during error handling

Wrapping the HttpRequestException in an InvalidOperationException could obscure the original exception details and make specific exception handling more difficult. Consider throwing the original HttpRequestException to maintain clarity.

Apply this diff to adjust the exception handling:

 try
 {
     response.EnsureSuccessStatusCode();
 }
 catch (global::System.Net.Http.HttpRequestException ex)
 {
-    throw new global::System.InvalidOperationException(__content, ex);
+    throw;
 }
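
If the response body should still reach the caller, a possible middle ground is to rethrow the same exception type with the body added to the message. The sketch below assumes .NET 5 or later (for the HttpRequestException constructor that takes a status code); EnsureSuccessAsync is a hypothetical helper, not part of the generated client.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// A canned failure response, so the sketch runs without network access.
using var response = new HttpResponseMessage(HttpStatusCode.BadRequest)
{
    Content = new StringContent("{\"message\":\"invalid request\"}"),
};

try
{
    await EnsureSuccessAsync(response);
}
catch (HttpRequestException ex)
{
    Console.WriteLine($"{(int?)ex.StatusCode}: {ex.Message}");
}

// Hypothetical helper: keeps the HttpRequestException type and the status code.
static async Task EnsureSuccessAsync(HttpResponseMessage httpResponse)
{
    if (httpResponse.IsSuccessStatusCode)
    {
        return;
    }

    // Read the body first, before the status check can throw.
    string body = await httpResponse.Content.ReadAsStringAsync();
    try
    {
        httpResponse.EnsureSuccessStatusCode();
    }
    catch (HttpRequestException ex)
    {
        // Same exception type; the original is kept as InnerException.
        throw new HttpRequestException($"{ex.Message} Body: {body}", ex, httpResponse.StatusCode);
    }
}
```

Callers can then keep catching HttpRequestException and inspect StatusCode, while the server's error payload is still visible in the message.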
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 34957ec and e843e1b.

Files selected for processing (23)
  • src/libs/Cohere/Generated/Cohere.CohereApi.Embedv2.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.ICohereApi.Embedv2.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response10.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response11.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response12.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response2.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response3.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response4.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response5.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response6.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response7.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response8.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response9.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Images.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.Texts.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.TextsTruncate.g.cs (1 hunks)
  • src/libs/Cohere/Generated/Cohere.Models.V2EmbedRequest.g.cs (1 hunks)
  • src/libs/Cohere/Generated/JsonConverters.TextsTruncate.g.cs (1 hunks)
  • src/libs/Cohere/Generated/JsonConverters.TextsTruncateNullable.g.cs (1 hunks)
  • src/libs/Cohere/Generated/JsonConverters.V2EmbedRequest.g.cs (1 hunks)
  • src/libs/Cohere/Generated/JsonSerializerContext.g.cs (2 hunks)
  • src/libs/Cohere/openapi.yaml (2 hunks)
Files skipped from review due to trivial changes (1)
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Response8.g.cs
Additional comments not posted (10)
src/libs/Cohere/Generated/Cohere.Models.Embedv2Response.g.cs (1)

15-15: Verify the data type of 'Data' property

The Data property is defined as string?. Confirm whether this accurately represents the data returned by the embedding operation. If Data is expected to be a collection of embeddings, consider using an appropriate data type like List<float[]> or similar.
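
As an illustration of what a typed alternative to string? could look like, the sketch below deserializes float vectors into List<float[]>. The JSON shape (an embeddings object with a float array of vectors) is an assumption to be checked against the actual API response, and EmbedResult and EmbeddingsByType are hypothetical models, not the generated Embedv2Response classes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Illustrative payload only; verify the real response shape against the API.
const string json = "{\"embeddings\":{\"float\":[[0.1,0.2],[0.3,0.4]]}}";

EmbedResult? parsed = JsonSerializer.Deserialize<EmbedResult>(json);
Console.WriteLine(parsed?.Embeddings?.Float?.Count); // prints 2

// Hypothetical models, not the generated Embedv2Response classes.
public sealed class EmbedResult
{
    [JsonPropertyName("embeddings")]
    public EmbeddingsByType? Embeddings { get; set; }
}

public sealed class EmbeddingsByType
{
    // One float[] per input, in request order (assumed).
    [JsonPropertyName("float")]
    public List<float[]>? Float { get; set; }
}
```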

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response2.g.cs (1)

15-15: Verify the data type of 'Data' property

The Data property is defined as string?. Confirm whether this accurately represents the data returned by the embedding operation. If Data should contain embedding vectors or other structured data, consider using an appropriate data type.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response3.g.cs (1)

15-15: Confirm the data type of 'Data' property

Verify that string? is the correct data type for Data. If it should represent complex data such as embedding vectors, an appropriate collection type should be used.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response4.g.cs (1)

15-15: Check the data type of 'Data' property

Ensure that string? is appropriate for the Data property. If it represents more complex data, adjust the type accordingly.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response5.g.cs (1)

15-15: Validate the type of 'Data' property

Verify that the Data property should be of type string?. If it holds complex data structures, consider updating the type.

src/libs/Cohere/Generated/Cohere.Models.Embedv2Response6.g.cs (1)

15-15: Confirm 'Data' property data type

Ensure that string? is the correct data type for the Data property based on the API response.

src/libs/Cohere/Generated/JsonConverters.TextsTruncateNullable.g.cs (1)

1-56: LGTM

The TextsTruncateNullableJsonConverter class is correctly implemented, handling both serialization and deserialization of nullable TextsTruncate enums efficiently.

src/libs/Cohere/Generated/Cohere.Models.Images.g.cs (1)

27-38: Verify InputType values for image embeddings

In the InputType property, ensure that the input_type values accurately reflect valid options for image embeddings. Confirm that 'image' is an acceptable value and that other options are applicable in this context.

src/libs/Cohere/Generated/JsonSerializerContext.g.cs (1)

68-69: LGTM

The addition of JSON converters for TextsTruncate, TextsTruncateNullable, and V2EmbedRequest appropriately extends the serialization context to support the new types.

Also applies to: 156-156

src/libs/Cohere/openapi.yaml (1)

10715-10721: Verify the correctness of the discriminator mapping in V2EmbedRequest

In the V2EmbedRequest schema (lines 10715-10721), the discriminator mapping lists the input_type values search_document, search_query, classification, clustering, and image. Each of these values must map to the schema that actually accepts it.

Please ensure that every input_type value in the discriminator mapping points to the correct one of the Texts and Images schemas.
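
To make the intended correspondence concrete, here is a small sketch of discriminator-style dispatch on input_type. The assignment of the four text-oriented values to Texts and of image to Images is an assumption about the intended mapping, and Describe is a hypothetical helper rather than anything in the generated code.

```csharp
#nullable enable
using System;
using System.Text.Json;

Console.WriteLine(Describe("{\"input_type\":\"image\",\"images\":[\"<data-uri>\"]}"));  // prints Images
Console.WriteLine(Describe("{\"input_type\":\"search_query\",\"texts\":[\"hi\"]}"));    // prints Texts

// Hypothetical helper: picks a schema name from the input_type discriminator.
static string Describe(string json)
{
    using JsonDocument doc = JsonDocument.Parse(json);
    string? inputType = doc.RootElement.GetProperty("input_type").GetString();

    // Assumed mapping: text-oriented values go to Texts, "image" goes to Images.
    return inputType switch
    {
        "search_document" or "search_query" or "classification" or "clustering" => "Texts",
        "image" => "Images",
        _ => throw new JsonException($"Unmapped input_type: {inputType}"),
    };
}
```

A mapping entry that points a text value at Images (or the reverse) would send a payload to a schema that lacks its required property, which is the failure this comment is guarding against.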

/// <summary>
///
/// </summary>
public sealed partial class Embedv2Response

Consider consistency in class naming

Ensure that the class name Embedv2Response aligns with existing naming conventions and accurately reflects the class's purpose. If there are multiple response classes for embedding, consider more descriptive names to differentiate them.

/// <summary>
///
/// </summary>
public sealed partial class Embedv2Response2

Clarify the need for multiple response classes

Having several classes like Embedv2Response, Embedv2Response2, etc., might cause confusion. Assess whether these classes serve distinct purposes or if they can be consolidated into a single, well-named class.

/// <summary>
///
/// </summary>
public sealed partial class Embedv2Response3

Duplicate class naming concern

Multiple classes with incremental numbering (Embedv2Response3, etc.) may indicate duplicate or placeholder classes. Consider unifying these classes or providing descriptive names to reflect their unique purposes.

/// <summary>
///
/// </summary>
public sealed partial class Embedv2Response4

Evaluate the necessity of 'Embedv2Response4' class

Repeated classes with similar names may lead to confusion. Assess if this class is necessary or if it can be merged with others.

/// <summary>
///
/// </summary>
public sealed partial class Embedv2Response5

Address potential class duplication

Having multiple classes like Embedv2Response5 may not be necessary. Review if these classes can be consolidated.

parameters:
- $ref: '#/components/parameters/RequestSource'
requestBody:
description: ''

Provide a meaningful description for the request body

The requestBody at line 3688 has an empty description. Providing a clear description enhances the API documentation and helps users understand the purpose of the request body.

Apply this diff to add a descriptive message:

        requestBody:
-          description: ''
+          description: 'Embedding request payload containing either text or image inputs.'

/v2/embed:
post:
summary: Embed
description: "This endpoint returns text embeddings. An embedding is a list of floating point numbers that captures semantic information about the text that it represents.\n\nEmbeddings can be used to create text classifiers as well as empower semantic search. To learn more about embeddings, see the embedding page.\n\nIf you want to learn more how to use the embedding model, have a look at the [Semantic Search Guide](/docs/semantic-search)."

Correct grammatical error in the endpoint description

The description in line 3683 contains a grammatical error. Specifically, the sentence should include "about" for clarity.

Apply this diff to correct the sentence:

-If you want to learn more how to use the embedding model, have a look at the [Semantic Search Guide](/docs/semantic-search).
+If you want to learn more about how to use the embedding model, have a look at the [Semantic Search Guide](/docs/semantic-search).

properties:
model:
type: string
description: "Defaults to embed-english-v2.0\nThe identifier of the model. Smaller \"light\" models are faster, while larger models will perform better. [Custom models](/docs/training-custom-models) can also be supplied with their full ID.\nAvailable models and corresponding embedding dimensions:\n* `embed-english-v3.0` 1024\n* `embed-multilingual-v3.0` 1024\n* `embed-english-light-v3.0` 384\n* `embed-multilingual-light-v3.0` 384\n* `embed-english-v2.0` 4096\n* `embed-english-light-v2.0` 1024\n* `embed-multilingual-v2.0` 768"

Update the model description for image embeddings

The model property's description in the Images schema at line 10693 refers to text embedding models, which may not be appropriate for image embeddings. Ensure that the description lists the correct image embedding models.

Apply this diff to correct the description:

-Defaults to embed-english-v2.0
-The identifier of the model. Smaller "light" models are faster, while larger models will perform better. [Custom models](/docs/training-custom-models) can also be supplied with their full ID.
-Available models and corresponding embedding dimensions:
-* `embed-english-v3.0`  1024
-* `embed-multilingual-v3.0`  1024
-* `embed-english-light-v3.0`  384
-* `embed-multilingual-light-v3.0`  384
-* `embed-english-v2.0`  4096
-* `embed-english-light-v2.0`  1024
-* `embed-multilingual-v2.0`  768
+Defaults to image-embed-v1.0
+The identifier of the image embedding model. Smaller "light" models are faster, while larger models will perform better.
+Available models and corresponding embedding dimensions:
+* `image-embed-v1.0` 512
+* `image-embed-light-v1.0` 256

Please adjust the model names and details according to the actual image embedding models available.

- public
model:
type: string
description: "Defaults to embed-english-v2.0\n\nThe identifier of the model. Smaller \"light\" models are faster, while larger models will perform better. [Custom models](/docs/training-custom-models) can also be supplied with their full ID.\n\nAvailable models and corresponding embedding dimensions:\n\n* `embed-english-v3.0` 1024\n* `embed-multilingual-v3.0` 1024\n* `embed-english-light-v3.0` 384\n* `embed-multilingual-light-v3.0` 384\n\n* `embed-english-v2.0` 4096\n* `embed-english-light-v2.0` 1024\n* `embed-multilingual-v2.0` 768"

Update default model to the latest version

In the Texts schema at line 10656, the model property's description states that the default is embed-english-v2.0. Considering that version 3.0 models are available, it's recommended to update the default model to the latest version for improved performance.

Apply this diff to update the default model:

-Defaults to embed-english-v2.0
+Defaults to embed-english-v3.0

Additionally, ensure that the API's default behavior aligns with this change.

Comment on lines +10685 to +10710
required:
- images
- model
- input_type
type: object
properties:
model:
type: string
description: "Defaults to embed-english-v2.0\nThe identifier of the model. Smaller \"light\" models are faster, while larger models will perform better. [Custom models](/docs/training-custom-models) can also be supplied with their full ID.\nAvailable models and corresponding embedding dimensions:\n* `embed-english-v3.0` 1024\n* `embed-multilingual-v3.0` 1024\n* `embed-english-light-v3.0` 384\n* `embed-multilingual-light-v3.0` 384\n* `embed-english-v2.0` 4096\n* `embed-english-light-v2.0` 1024\n* `embed-multilingual-v2.0` 768"
writeOnly: true
x-fern-audiences:
- public
input_type:
$ref: '#/components/schemas/EmbedInputType'
embedding_types:
type: array
items:
$ref: '#/components/schemas/EmbeddingType'
description: "Specifies the types of embeddings you want to get back. Not required and default is None, which returns the Embed Floats response type. Can be one or more of the following types.\n* `\"float\"`: Use this when you want to get back the default float embeddings. Valid for all models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Valid for only v3 models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Valid for only v3 models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Valid for only v3 models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Valid for only v3 models."
writeOnly: true
x-fern-audiences:
- public
x-fern-sdk-group-name: v2
x-fern-audiences:
- public
V2EmbedRequest:

Define the images property in the Images schema

In the Images schema starting at line 10685, the images property is listed as required but is not defined in the properties section. This omission will lead to validation errors and incomplete API documentation.

Apply this diff to add the images property:

      properties:
+       images:
+         maxItems: 96
+         minItems: 1
+         type: array
+         items:
+           type: string
+           format: binary
+         description: An array of images (as binary data or URLs) for the model to embed. Maximum number of images per call is `96`.
+         writeOnly: true
+         x-fern-audiences:
+           - public
        model:
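
A mismatch like this one can be caught mechanically before review. The following is a minimal sketch, not part of this PR: it assumes the spec's components/schemas section has already been parsed into plain dictionaries, and the function name find_undefined_required and the trimmed schemas sample are hypothetical, introduced only for illustration.

```python
def find_undefined_required(schemas):
    """Map each schema name to the required names missing from its properties."""
    problems = {}
    for name, schema in schemas.items():
        required = schema.get("required", [])
        defined = schema.get("properties", {})
        missing = [prop for prop in required if prop not in defined]
        if missing:
            problems[name] = missing
    return problems


# Trimmed, hypothetical stand-in for the Images schema as it appears before the fix.
schemas = {
    "Images": {
        "required": ["images", "model", "input_type"],
        "type": "object",
        "properties": {
            "model": {"type": "string"},
            "input_type": {"$ref": "#/components/schemas/EmbedInputType"},
            "embedding_types": {"type": "array"},
        },
    },
}

print(find_undefined_required(schemas))  # {'Images': ['images']}
```

Once images is added under properties, the same call returns an empty dictionary for this schema.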
Suggested change
      required:
        - images
        - model
        - input_type
      type: object
      properties:
        images:
          maxItems: 96
          minItems: 1
          type: array
          items:
            type: string
            format: binary
          description: An array of images (as binary data or URLs) for the model to embed. Maximum number of images per call is `96`.
          writeOnly: true
          x-fern-audiences:
            - public
        model:
          type: string
          description: "Defaults to embed-english-v2.0\nThe identifier of the model. Smaller \"light\" models are faster, while larger models will perform better. [Custom models](/docs/training-custom-models) can also be supplied with their full ID.\nAvailable models and corresponding embedding dimensions:\n* `embed-english-v3.0` 1024\n* `embed-multilingual-v3.0` 1024\n* `embed-english-light-v3.0` 384\n* `embed-multilingual-light-v3.0` 384\n* `embed-english-v2.0` 4096\n* `embed-english-light-v2.0` 1024\n* `embed-multilingual-v2.0` 768"
          writeOnly: true
          x-fern-audiences:
            - public
        input_type:
          $ref: '#/components/schemas/EmbedInputType'
        embedding_types:
          type: array
          items:
            $ref: '#/components/schemas/EmbeddingType'
          description: "Specifies the types of embeddings you want to get back. Not required and default is None, which returns the Embed Floats response type. Can be one or more of the following types.\n* `\"float\"`: Use this when you want to get back the default float embeddings. Valid for all models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Valid for only v3 models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Valid for only v3 models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Valid for only v3 models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Valid for only v3 models."
          writeOnly: true
          x-fern-audiences:
            - public
      x-fern-sdk-group-name: v2
      x-fern-audiences:
        - public
    V2EmbedRequest:
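
To illustrate what the suggested schema would require of a caller, here is a rough client-side sketch. It only mirrors the required list and the minItems/maxItems bounds from the suggestion above; the function name check_image_embed_request is hypothetical, the input_type and images values are placeholders (the EmbedInputType values are not shown in this thread), and none of this is the SDK's own validation.

```python
REQUIRED = ("images", "model", "input_type")  # from the schema's required list
MIN_IMAGES, MAX_IMAGES = 1, 96                # minItems / maxItems in the suggestion


def check_image_embed_request(body):
    """Return a list of problems; an empty list means the body passes these checks."""
    problems = [f"missing required field: {key}" for key in REQUIRED if key not in body]
    images = body.get("images")
    if images is not None:
        if not isinstance(images, list):
            problems.append("images must be an array")
        elif not MIN_IMAGES <= len(images) <= MAX_IMAGES:
            problems.append(
                f"images must contain {MIN_IMAGES} to {MAX_IMAGES} items, got {len(images)}"
            )
    return problems


# Placeholder values only; a real request needs actual image data and a valid input_type.
body = {
    "images": ["<image data>"],
    "model": "embed-english-v3.0",
    "input_type": "<input type>",
}
print(check_image_embed_request(body))  # []
```

A body that omits images, or passes an empty or oversized array, comes back with a non-empty list of problems instead.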
